Printf-style debug of where workflow startup errors go during queries... #932
What was changed
When a query task is evaluated but the workflow code no longer accepts the input payload of the StartWorkflowEvent, the resulting error is currently dropped without any logs whatsoever (even with EnableVerboseLogging). The list of possible queries against that workflow is then empty (query handlers would be registered at the start of the workflow code), so queries vanish unexpectedly and without any explanation.
This PR outlines the data flow (just follow the Printf markers 1.HERE, 2.XXX, 3.YYY, 4.ZZZ) and adds a warning for all workflow results ending in error, which will also be visible when the evaluated task was just for a query. This is likely too noisy and should be adapted further.
Why?
If workflow input types are changed and queries stop working, there should be some indication in the logs that this is due to incompatible workflow code versions instead of silence.
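As a rough illustration of the warning described under "What was changed", the sketch below emits a message for any workflow result that ends in error, including when the task only carried a query. It uses only the Go standard library; `taskResult`, `warningFor`, `errInputDecode`, and the message wording are hypothetical names chosen for this example, not identifiers from the SDK or from this PR.

```go
package main

import (
	"errors"
	"log"
)

// errInputDecode is a hypothetical stand-in for the error raised when the
// recorded start input cannot be decoded by the current workflow code.
var errInputDecode = errors.New("unable to decode workflow input")

// taskResult is a hypothetical summary of one evaluated workflow task.
type taskResult struct {
	WorkflowType string
	QueryOnly    bool  // true when the task was evaluated only to answer a query
	Err          error // non-nil when the workflow result ended in error
}

// warningFor returns the warning text for a result, or "" when there is
// nothing to report. Query-only tasks are deliberately not skipped, so a
// failed workflow start stays visible even when only a query was evaluated.
func warningFor(r taskResult) string {
	if r.Err == nil {
		return ""
	}
	kind := "workflow task"
	if r.QueryOnly {
		kind = "query task"
	}
	return "workflow ended in error during " + kind +
		": WorkflowType=" + r.WorkflowType + " Error=" + r.Err.Error()
}

func main() {
	r := taskResult{WorkflowType: "OrderWorkflow", QueryOnly: true, Err: errInputDecode}
	if msg := warningFor(r); msg != "" {
		log.Println("WARN", msg)
	}
}
```

Warning on every error result is the simplest rule. A less noisy variant might warn only when the task was query-only, or only for errors raised before any workflow code ran.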
Checklist
Related to: No problem indicators on failed workflow input deserialization during queries #933
How was this tested:
We (inadvertently) updated the workflow definition code such that the input payload of older executions could no longer be deserialized into the input type of the new workflow code. Then we tried to execute queries (via the UI) on the old workflow. However, even the defined query types were gone from the dropdown. We observed no logs indicating any error condition, neither on the Temporal services nor on the workflow worker evaluating the query task.
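The kind of incompatibility described above can be reproduced in isolation. The following is a minimal sketch, assuming the input payload is JSON-encoded. `OrderInputV1`, `OrderInputV2`, and `decodeInput` are hypothetical names invented for this example and use only the standard library, not the SDK's data converter.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OrderInputV1 is a hypothetical input type used by older executions.
type OrderInputV1 struct {
	OrderID string `json:"order_id"`
	Amount  string `json:"amount"` // recorded as a JSON string
}

// OrderInputV2 is the hypothetical input type after the code change:
// Amount is now an int, so old payloads no longer fit.
type OrderInputV2 struct {
	OrderID string `json:"order_id"`
	Amount  int    `json:"amount"`
}

// decodeInput mimics decoding a recorded start payload into the input type
// expected by the current workflow code.
func decodeInput(payload []byte, into interface{}) error {
	if err := json.Unmarshal(payload, into); err != nil {
		return fmt.Errorf("decode workflow input: %w", err)
	}
	return nil
}

func main() {
	// Payload as it would have been recorded by the old workflow version.
	old, err := json.Marshal(OrderInputV1{OrderID: "o-1", Amount: "12"})
	if err != nil {
		panic(err)
	}

	// The new code tries to decode it into the changed type and fails.
	var in OrderInputV2
	if err := decodeInput(old, &in); err != nil {
		fmt.Println("warning:", err)
	}
}
```

If an error like this is dropped before any workflow code runs, no query handlers are registered, which matches the empty dropdown we saw.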
We used a go.mod replace directive to recompile the workflow worker against the source branch of this PR and observed the new warning message when workflow initialization failed due to failures during input deserialization.